Q. While retrofitting an existing application to support MBCS, what is the guideline for changing the size of existing database fields?
A. As a rule of thumb, doubling the field size in bytes lets the field hold roughly the same number of characters as before.

Q. Does libcur in AIX support bidirectional languages, such as Hebrew and Arabic?
A. No.

Q. What is the extent of XPG4 support provided in LEX?
A. To get the wide character functions in the generated code, you must tell lex that you want to use them by setting the corresponding multibyte table sizes to something greater than zero. At the end of this note is an excerpt from the 4.1 man page on lex which describes how to do this.

One misconception about XPG4 support for lex is that you can create a lex program and then run it in multiple locales. However, this is not possible. As XPG4 states, "The lex utility is not fully internationalised in its treatment of regular expressions in the lex source code or the generated lexical analyser. It would seem desirable to have the lexical analyser interpret the regular expressions given in the lex source according to the environment specified when the lexical analyser is executed, but this is not possible with the current lex technology."

In other words, the lexical analyser must be generated and run in the same locale if you want to match anything that is locale specific. If the lex source uses only the portable character set, the analyser should also work in other locales; the "." character can then be used to match everything else. Remember, however, that range expressions are not portable between locales.

LEX MAN PAGE
------------
In the definitions section, you can set table sizes for the resulting finite state machine. The default sizes are large enough for small programs. You may want to set larger sizes for more complex programs.

%an   Number of transitions is n (default 5000)
%en   Number of parse tree nodes is n (default 2000)
%hn   Number of multibyte character output slots (default is 0)
%kn   Number of packed character classes (default 1000)
%mn   Number of multibyte "character class" character output slots (default is 0)
%nn   Number of states is n (default 2500)
%on   Number of output slots (default 5000, minimum 257)
%pn   Number of positions is n (default 5000)
%vp   Percentage of slots vacant in the hash tables controlled by %h and %m (default 20, range 0 <= P < 100)
%zn   Number of multibyte character class output slots (default 0)

If multibyte characters appear in extended regular expression strings, you may need to reset the output array size with the %o argument (possibly to array sizes in the range 10,000 to 20,000). This reset reflects the much larger number of characters relative to the number of single-byte characters.

If multibyte characters appear in extended regular expressions, you must set the multibyte hash table sizes with the %h and %m arguments to sizes greater than the total number of multibyte characters contained in the lex file.

If no multibyte characters appear in extended regular expressions but you want '.' to match multibyte characters, you must set %z greater than zero. Similarly, for inverse character classes (for example, [^abc]) to match multibyte characters, you must set both %h and %m greater than zero.

When using multibyte characters, the lex.yy.c file must be compiled with the -qmbcs compiler option.

Q. When using XFontSet, how is "point size" specified?
A. An internationalized application must allow for the fact that rendering all the characters of a locale may require several fonts, and that the encodings of those fonts may differ from the code set of the locale. An XFontSet is specified as a list of XLFD names, in which only the base characteristics, such as point size, style, and weight, are significant.
The encoding of the desired fonts is determined from the locale, and any charsets specified in the XLFD base name list are ignored, so users need only concentrate on specifying the base characteristics. Here is an example of a font set specification:

    -dt-*-medium-*-24-*-m-*

Q. What does the acronym "XOJIG" stand for?
A. X/Open Joint Internationalization Group. It is the locale registry at X/Open.

Q. How do you do I18N character manipulation in Fortran?
A. It has to be done through the interface to C.
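
As an illustration of the Fortran answer, here is a minimal sketch of a C helper that counts characters (rather than bytes) in a multibyte string, together with a by-reference wrapper of the kind a Fortran program might call. The names mb_char_count and mbcount are hypothetical and not part of any library. The sketch assumes the Fortran side passes the buffer and its byte length explicitly; the external name (for example, whether a trailing underscore is needed) and any hidden string-length arguments depend on the Fortran compiler and are not shown.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/*
 * Count the characters in the first nbytes bytes of s, interpreting
 * them according to the current LC_CTYPE locale.  The length is passed
 * explicitly because Fortran strings are not null-terminated.
 * Returns -1 if the buffer holds an invalid or truncated sequence.
 */
long mb_char_count(const char *s, size_t nbytes)
{
    mbstate_t state;
    long count = 0;

    assert(s != NULL);
    memset(&state, 0, sizeof state);      /* start in the initial shift state */

    while (nbytes > 0) {
        size_t len = mbrlen(s, nbytes, &state);

        if (len == (size_t)-1 || len == (size_t)-2)
            return -1;                    /* invalid or incomplete character */
        if (len == 0)
            len = 1;                      /* an embedded null byte is one character */

        s += len;
        nbytes -= len;
        count++;
    }
    return count;
}

/*
 * Hypothetical Fortran-callable wrapper: Fortran normally passes
 * arguments by reference, so the length arrives as a pointer.
 */
int mbcount(const char *s, const int *nbytes)
{
    return (int)mb_char_count(s, (size_t)*nbytes);
}
```

The counts follow whatever locale is in effect, so some C code in the program would need to call setlocale(LC_ALL, "") before the first call; otherwise the default "C" locale applies.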